AD-A104 720
UNCLASSIFIED

SOUTHERN METHODIST UNIV DALLAS TEX DEPT OF STATISTICS    F/G 12/1
A REGRESSION DESIGN APPROACH TO OPTIMAL AND ROBUST SPACING SELECTION (U)
JUL 81    R L EUBANK    N00014-75-C-0439
TR-144 NL


A REGRESSION DESIGN APPROACH TO OPTIMAL AND
ROBUST SPACING SELECTION

by

Randall L. Eubank

Technical Report No. 144
Department of Statistics ONR Contract

July 1981

Research sponsored by the Office of Naval Research
Contract N00014-75-C-0439


Reproduction  in  whole  or  in  part  is  permitted 
for  any  purpose  of  the  United  States  Government 


This  document  has  been  approved  for 
public  release  and  sale;  its  distribution  is  unlimited 


DEPARTMENT  OF  STATISTICS 
Southern  Methodist  University 
Dallas,  Texas  75275 




A  Regression  Design  Approach  to  Optimal  and 
Robust  Spacing  Selection 

By  R.  L.  Eubank 

Short  title:  Spacing  Selection 

Summary.  -  The  problem  of  location  and/or  scale  parameter  estimation 
using  the  asymptotically  best  linear  unbiased  estimator  based  on  sample 
quantiles  is  considered.  The  problem  of  optimal  spacing  selection  for 
these  estimators  is  shown  to  be  equivalent  to  the  problem  of  regression 
design  for  time  series  with  Brownian  motion  or  Brownian  bridge  covariance 
structures  and  a  particular  variable  knot  spline  approximation  problem. 
This  equivalence  is  employed,  in  conjunction  with  a  regression  framework, 
to  investigate  the  asymptotic  properties  of  certain  spacing  selection 
schemes.  In  particular,  an  asymptotic  alternative  is  developed  to  a 
robust estimation procedure suggested by Chan and Rhodin (1980).

1. Introduction. In a location and scale parameter model it is
assumed that a random sample $X_1, \ldots, X_N$ is obtained from a
distribution of the form

$$F(x) = F_0\left(\frac{x-\mu}{\sigma}\right),$$

where $F_0$ is a known distributional form and $\mu$ and $\sigma$ are,
respectively, a location and scale parameter. Usually $\mu$ and/or $\sigma$
are unknown and must be estimated from the data. In this paper, the
properties of the asymptotically best linear unbiased estimators (ABLUE's)
of $\mu$ and $\sigma$ based on $n < N$ sample quantiles will be investigated.


The ABLUE is an easily computed estimator which derives from the
asymptotic distribution of the sample quantiles and was first suggested
by Mosteller (1946). Further properties and computational formulas
were later derived by Ogawa (1951). In particular, Ogawa obtained
explicit expressions for the asymptotic relative efficiency (ARE) of
the ABLUE with respect to the Cramér-Rao lower variance bound for
unbiased parameter estimation. The ABLUE has also received attention
in the context of robust estimation due to the work of Chan and
Rhodin (1980).

Define the sample quantile function, $Q$, by

$$Q(u) = X_{(j)}, \quad \frac{j-1}{N} < u \le \frac{j}{N}, \quad j = 1, \ldots, N, \qquad (1.1)$$

where $X_{(j)}$ denotes the $j$th sample order statistic. Then, for any set
of real numbers $0 < u_1 < \cdots < u_n < 1$, the ABLUE's of $\mu$ and
$\sigma$ will have the form $\sum_{i=1}^{n} b(u_i) Q(u_i)$ (c.f. Ogawa (1951)
or Eubank (1979) for explicit expressions for the $b(u_i)$ in the various
estimation situations). Since the ABLUE uses a subsample of $n < N$ of the
sample quantiles or order statistics, the quantiles which are utilized in
the estimator, or equivalently their spacing, $u_1, \ldots, u_n$, must be
chosen appropriately. The problem of optimally selecting the $u_i$,
$i = 1, \ldots, n$, has classically been termed the optimal spacing problem
and has been addressed by Bloch (1966), Balmer, Boulton and Sack (1974),
Chan (1970), Chernoff (1971), Eisenberger and Posner (1965), Gupta and
Gnanadesikan (1966), Hassanein (1968, 1969a, 1969b, 1971, 1972, 1977),
Kulldorf (1963), Kulldorf and Vannman (1973), Rhodin (1976), Sarhan and
Greenberg (1958, 1962) and Sarndal (1962, 1964).
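The step function in (1.1) and the linear form of the ABLUE can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the weights passed in the usage note are placeholders, not the ABLUE coefficients $b(u_i)$ given by Ogawa (1951).

```python
import math

def sample_quantile(sample, u):
    """Q(u) = X_(j) for (j-1)/N < u <= j/N, following (1.1)."""
    if not 0.0 < u <= 1.0:
        raise ValueError("u must lie in (0, 1]")
    order_stats = sorted(sample)
    n_obs = len(order_stats)
    # Smallest j with u <= j/N.  Floating-point rounding can matter when
    # u sits exactly on a boundary j/N.
    j = math.ceil(n_obs * u)
    return order_stats[j - 1]

def linear_quantile_estimate(sample, spacing, weights):
    """Evaluate sum_i b(u_i) Q(u_i) for user-supplied weights b(u_i)."""
    return sum(b * sample_quantile(sample, u)
               for u, b in zip(spacing, weights))
```

For the sample `[3, 1, 4, 2]`, `sample_quantile(sample, 0.5)` picks the second order statistic, and `linear_quantile_estimate(sample, [0.25, 0.75], [0.5, 0.5])` averages the first and third order statistics.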

The usual approach to the optimal spacing problem has been to attempt
to find a spacing which corresponds to the maximum value for one of the
ARE expressions given by Ogawa. Whereas this is usually a straightforward,
albeit tedious, numerical problem in the event of a single unknown
parameter, the case when both parameters are unknown usually proves to be
both analytically and numerically intractable.

This  latter  fact  has  led  to  the  use  of  "suboptimal"  spacing  selection 
schemes  such  as  the  selection  of  a  spacing  which  maximizes  the  sum  of 
the  two  ARE's  of  the  estimators  or  a  spacing  which  minimizes  the  sum  of 
the  estimators'  variances.  This  type  of  approach  has  been  employed  by 
Eisenberger  and  Posner  (1965)  and  Hassanein  (1969a,  1969b,  1977) . 

In this paper the asymptotic (as $n \to \infty$) properties of optimal as
well as suboptimal spacing selection schemes are derived and then
utilized to obtain an analytic approach to robust estimation problems
such as that considered by Chan and Rhodin (1980). These results are
obtained using the regression analysis framework developed in
Parzen (1979). When viewed in a regression setting, so-called suboptimal
spacings which, for instance, minimize the sum of the variances are seen
to be well motivated from the perspective of regression design as well as
for other reasons such as computational simplicity.

The asymptotic implications of several spacing selection criteria
are considered in Section 3. The regression framework from which these
results derive is presented in Section 2 where it is shown that the
problem of optimal spacing selection can be viewed as both an optimal
regression design problem and a variable knot spline approximation
problem for splines of order 1. This fact permits the derivation of
the asymptotic results in Section 3. Finally, in Section 4, the results
of Section 3 are utilized to develop a general approach to a problem
of robust ABLUE construction.

2. Regression design, spacing selection, and variable knot spline
approximation. The objective of this section is to show the equivalence
of three problems: the optimal spacing problem, the problem of regression
design in the presence of errors with Brownian bridge (or Brownian motion)
covariance structure, and the problem of finding the best $L^2[0,1]$
approximation of a particular function by piecewise constants with free
breakpoints. First, however, a few preliminaries are required.

Now, and in subsequent discussions, it will be assumed that $F_0$ is
absolutely continuous with associated density $f_0 := F_0'$ (where := means
"is defined as"). The quantile function corresponding to $F_0$ is
$Q_0(u) := F_0^{-1}(u) := \inf\{x \mid F_0(x) \ge u\}$ and the
density-quantile function is defined as $d_0(u) := f_0(Q_0(u))$,
$0 \le u \le 1$.
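As a concrete illustration of the density-quantile function, the following Python sketch evaluates $d_0(u) = f_0(Q_0(u))$ for three standard laws. The function names are illustrative only; the closed forms for the Cauchy and logistic cases follow directly from their densities and quantile functions.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def normal_density_quantile(u):
    """d_0(u) = f_0(Q_0(u)) for the standard normal law, 0 < u < 1."""
    return _STD_NORMAL.pdf(_STD_NORMAL.inv_cdf(u))

def cauchy_density_quantile(u):
    """d_0(u) for the standard Cauchy law, Q_0(u) = tan(pi (u - 1/2))."""
    x = math.tan(math.pi * (u - 0.5))
    return 1.0 / (math.pi * (1.0 + x * x))   # equals sin(pi u)^2 / pi

def logistic_density_quantile(u):
    """d_0(u) = u (1 - u) for the standard logistic law."""
    return u * (1.0 - u)
```

All three functions tend to zero as $u$ approaches 0 or 1, in line with the requirement $d_0(0) = d_0(1) = 0$ used later in this section.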

Parzen (1979) has shown that for large $N$ the problem of location
and/or scale parameter estimation can be considered as a continuous
parameter time series regression problem through use of the model

$$d_0(u) Q(u) = \mu d_0(u) + \sigma d_0(u) Q_0(u) + \sigma_B B(u), \quad u \in [0,1], \qquad (2.1)$$

where $\sigma_B = \sigma/\sqrt{N}$ and $B(\cdot)$ is a Brownian bridge
process, i.e., $B(\cdot)$ is a zero mean normal process with covariance
kernel

$$R(u_1, u_2) = u_1 - u_1 u_2, \quad 0 \le u_1 \le u_2 \le 1. \qquad (2.2)$$

Consequently, under the regularity condition that both $d_0$ and the
product of $d_0$ and $Q_0$, $d_0 \cdot Q_0$, are in the reproducing kernel
Hilbert space (RKHS) generated by $R$, the techniques developed by
Parzen (1961a, 1961b) may be utilized to construct linear estimators of
$\mu$ and $\sigma$ which are based on the entire set of $N$ sample
quantiles. Denote these estimators by $\hat\mu$ and $\hat\sigma$. Their
corresponding variance-covariance matrix is $\sigma_B^2 A^{-1}$, where
$A$ is the usual intrinsic accuracy matrix associated with the location and
scale parameter model (2.1).
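For readers who want to experiment with model (2.1), the Brownian bridge kernel of (2.2) and the covariance matrix it induces on a finite design can be sketched as follows. The function names are hypothetical and the kernel is written in its symmetric form $\min(s,t) - st$.

```python
def bridge_kernel(s, t):
    """Brownian bridge covariance R(s, t) = min(s, t) - s t, as in (2.2)."""
    return min(s, t) - s * t

def design_covariance(spacing):
    """Covariance matrix [R(u_i, u_j)] of B(u_1), ..., B(u_n)."""
    return [[bridge_kernel(ui, uj) for uj in spacing] for ui in spacing]
```

For example, `design_covariance([0.25, 0.5, 0.75])` has diagonal entries $u_i(1 - u_i)$, the variance of the bridge at each design point.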


The problem of optimal regression design selection for model (2.1)
is also, clearly, a problem of optimal quantile selection. It is, in
fact, the optimal spacing problem. To see this define the set of all
possible $n$-point designs for model (2.1) by

$$D_n := \{(u_1, \ldots, u_n) \mid 0 < u_1 < u_2 < \cdots < u_n < 1\}.$$

Given a particular design, $U = \{u_1, \ldots, u_n\} \in D_n$, the
observation set $\{d_0(u_1)Q(u_1), \ldots, d_0(u_n)Q(u_n)\}$ may be
utilized, as a result of model (2.1), to construct estimators,
$\hat\mu(U)$ and $\hat\sigma(U)$, for $\mu$ and $\sigma$ through the use of
generalized least squares. Denote the variance-covariance matrix of these
estimators by $\sigma_B^2 A(U)^{-1}$. It has been noted by Eubank (1981)
that $\hat\mu(U)$ and $\hat\sigma(U)$ coincide with the ABLUE's for $\mu$
and $\sigma$ based on the spacing $U$ and that $\sigma_B^2 A(U)^{-1}$
coincides with their asymptotic variance-covariance matrix. Since the ARE
expression for simultaneous parameter estimation given by Ogawa (1951) is

$$\mathrm{ARE}(\hat\mu(U), \hat\sigma(U)) = |A(U)| / |A|,$$

where $|\cdot|$ denotes the determinant function, it is now apparent that
the optimal spacing problem is identical with the problem of D-optimal
design selection for model (2.1). The criterion of minimizing the sum of the
estimators' variances is now recognized as A-optimal design selection
since

$$\sigma_B^2 \, \mathrm{tr}\, A(U)^{-1} = \mathrm{Var}(\hat\mu(U)) + \mathrm{Var}(\hat\sigma(U)),$$

where tr denotes the matrix trace. Similarly, maximizing the sum of
the ARE's is equivalent to maximizing $\mathrm{tr}[A(U)M]$, where
$M^{-1} := \mathrm{diag}(a_{11}, a_{22})$ and $a_{ii}$, $i = 1, 2$, denote
the diagonal elements of $A$.
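The design criteria above can be tried out numerically. The sketch below computes $A(U)$ for the normal law and evaluates the D- and A-criteria. It relies on an assumption not spelled out in the text: for the Brownian bridge kernel the generalized least squares quadratic forms reduce to sums of squared increments over the intervals between design points (with $u_0 = 0$, $u_{n+1} = 1$ and the regression functions vanishing at the endpoints). The function names and the example spacing are illustrative only.

```python
import math
from statistics import NormalDist

def information_matrix(spacing, g1, g2):
    """A(U) for regression functions g1 = d_0 and g2 = d_0 * Q_0.

    Assumes the increment form of X' R^{-1} X under the Brownian bridge
    kernel, with u_0 = 0, u_{n+1} = 1 and g(0) = g(1) = 0.
    """
    knots = [0.0] + sorted(spacing) + [1.0]
    v1 = [0.0] + [g1(u) for u in knots[1:-1]] + [0.0]
    v2 = [0.0] + [g2(u) for u in knots[1:-1]] + [0.0]
    a11 = a12 = a22 = 0.0
    for i in range(1, len(knots)):
        du = knots[i] - knots[i - 1]
        d1 = v1[i] - v1[i - 1]
        d2 = v2[i] - v2[i - 1]
        a11 += d1 * d1 / du
        a12 += d1 * d2 / du
        a22 += d2 * d2 / du
    return [[a11, a12], [a12, a22]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def trace_of_inverse(m):
    return (m[0][0] + m[1][1]) / det2(m)   # tr(M^{-1}) for a 2x2 matrix

_N = NormalDist()

def d0(u):
    return _N.pdf(_N.inv_cdf(u))

def d0_q0(u):
    return d0(u) * _N.inv_cdf(u)

A_FULL = [[1.0, 0.0], [0.0, 2.0]]          # intrinsic accuracy, normal law
A_U = information_matrix([0.1, 0.3, 0.5, 0.7, 0.9], d0, d0_q0)
are_joint = det2(A_U) / det2(A_FULL)       # D-criterion: |A(U)| / |A|
a_criterion = trace_of_inverse(A_U)        # A-criterion, up to sigma_B^2
are_median = information_matrix([0.5], d0, d0_q0)[0][0] / A_FULL[0][0]
```

With the single design point $u = 0.5$ the location efficiency `are_median` reduces to the familiar value $2/\pi$ for the sample median under normality, which gives a quick sanity check on the increment formula.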

In the case of, for instance, D-optimal designs for model (2.1) it
suffices to maximize $|A(U)|$ over $U \in D_n$. It should be noted that
such a design is also D-optimal for the regression model

$$Y(t) = \beta_1 d_0(t) + \beta_2 d_0(t) Q_0(t) + X(t), \quad t \in [0,1], \qquad (2.3)$$

where $X(\cdot)$ is a Brownian bridge process. Similar remarks hold for
other optimality criteria and for the case of only one unknown parameter.
Thus, if $\mu$ ($\sigma$) is known an optimal spacing for estimating
$\sigma$ ($\mu$) is also an optimal design for the estimation of
$\beta_2$ ($\beta_1$) when $\beta_1$ ($\beta_2$) is known.

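The problem of optimal design (and hence optimal spacing)
selection may be analyzed using the RKHS approach developed by Sacks
and Ylvisaker (1966, 1968). Therefore, let $H(R)$ denote the RKHS
generated by $R$ in (2.2) with associated norm denoted by
$\|\cdot\|_R$. It can be shown (c.f. Parzen (1979)) that

$$H(R) = \{f \mid f(0) = f(1) = 0, \ f' \in L^2[0,1]\}.$$

The inner product of $f, g \in H(R)$ is

$$\langle f, g \rangle_R = \int_0^1 f'(x) g'(x)\,dx = \langle f', g' \rangle_{L^2}, \qquad (2.4)$$

where $\langle \cdot, \cdot \rangle_{L^2}$ denotes the usual $L^2[0,1]$
inner product. It now follows from the work of Sacks and Ylvisaker (1968)
that the matrices $A$ and $A(U)$ associated with the estimators
$(\hat\mu, \hat\sigma)^t$ and $(\hat\mu(U), \hat\sigma(U))^t$ respectively,
are given by

$$A = \begin{pmatrix} \langle d_0, d_0 \rangle_R & \langle d_0, d_0 \cdot Q_0 \rangle_R \\ \langle d_0, d_0 \cdot Q_0 \rangle_R & \langle d_0 \cdot Q_0, d_0 \cdot Q_0 \rangle_R \end{pmatrix} \qquad (2.5)$$

and

$$A(U) = \begin{pmatrix} \langle P_U d_0, P_U d_0 \rangle_R & \langle P_U d_0, P_U (d_0 \cdot Q_0) \rangle_R \\ \langle P_U d_0, P_U (d_0 \cdot Q_0) \rangle_R & \langle P_U (d_0 \cdot Q_0), P_U (d_0 \cdot Q_0) \rangle_R \end{pmatrix}, \qquad (2.6)$$

where $P_U$ denotes the $H(R)$ orthogonal projection operator for the
subspace

$$R_U := \mathrm{span}\{R(\cdot, u_i) \mid u_i \in U\}.$$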


Equations (2.4) - (2.6) have important implications for the
optimal spacing problem. To illustrate this point consider the case
of location parameter estimation when $\sigma$ is assumed known. For a
particular spacing, $U$, since $\sigma_B^2 A(U)^{-1}$ is the asymptotic
variance-covariance matrix of the ABLUE it follows that

$$\mathrm{ARE}(\hat\mu(U)) = \frac{\|P_U d_0\|_R^2}{\|d_0\|_R^2} = 1 - \frac{\|d_0 - P_U d_0\|_R^2}{\|d_0\|_R^2}$$

as a result of the Pythagorean theorem. Thus, maximizing
$\mathrm{ARE}(\hat\mu(U))$ with respect to $U$ is equivalent to minimizing
$\|d_0 - P_U d_0\|_R$ over all $U \in D_n$. However, from (2.4),

$$\|d_0 - P_U d_0\|_R^2 = \|d_0' - P_U' d_0'\|_{L^2}^2,$$

where $P_U'$ is the $L^2[0,1]$ orthogonal projection operator for the
($L^2$) subspace

$$R_U' = \mathrm{span}\left\{ \frac{\partial R(u, u_i)}{\partial u} \,\Big|\, u_i \in U \right\}. \qquad (2.7)$$

Reference to (2.2) verifies that $R_U'$ consists of splines of order 1
(piecewise constants) with knots or breakpoints at the elements of $U$.
Therefore, the optimal spacing problem is now seen to coincide with the
following variable knot spline approximation problem for $d_0'$: find
$U^* \in D_n$ such that

$$\|d_0' - P_{U^*}' d_0'\|_{L^2} = \inf_{U \in D_n} \|d_0' - P_U' d_0'\|_{L^2}. \qquad (2.8)$$

To ascertain how problem (2.8) relates to the usual type of variable
knot piecewise constant approximation problem first note that for any
$U \in D_n$ the elements in $R_U'$ are orthogonal to the unit function, 1,
in $L^2[0,1]$. Thus, the set of all splines of order 1 with knots at $U$,
$S_U$, may be written

$$S_U = \mathrm{span}\{1\} \oplus R_U'.$$


As $d_0 \in H(R)$ requires that $d_0(0) = d_0(1) = 0$ it follows that
$d_0' \perp \mathrm{span}\{1\}$, as well. Therefore,
$d_0' - P_U' d_0' \perp S_U$ (in $L^2$) and, consequently, $P_U' d_0'$ is
the best $L^2[0,1]$ approximation to $d_0'$ from $S_U$. Now, let $U^*$ be
defined as in (2.8) and let $P_{S_U}$ denote the $L^2[0,1]$ orthogonal
projector for $S_U$. Then $P_{U^*}' d_0'$ satisfies

$$\|d_0' - P_{U^*}' d_0'\|_{L^2} = \inf_{U \in D_n} \|d_0' - P_{S_U} d_0'\|_{L^2}. \qquad (2.9)$$

Equation (2.9) has the consequence that, for location parameter
estimation, the optimal spacing problem is equivalent to finding the
best set of knots for piecewise constant approximation of $d_0'$ in
$L^2[0,1]$. An analogous result holds for scale parameter estimation.
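The link between spacing selection and piecewise constant approximation can be checked numerically. The sketch below uses the logistic law, for which $d_0(u) = u(1-u)$ and $d_0'(u) = 1 - 2u$, and a simple midpoint rule; both choices, and the function names, are illustrative assumptions. It uses the fact that the best $L^2$ piecewise constant fit to $d_0'$ on each interval between knots is the interval mean of $d_0'$, which equals the increment of $d_0$ divided by the interval length.

```python
def interval_means(g, spacing):
    """Heights of the best L2 piecewise-constant fit to g' with knots at U."""
    knots = [0.0] + sorted(spacing) + [1.0]
    heights = [(g(b) - g(a)) / (b - a) for a, b in zip(knots, knots[1:])]
    return knots, heights

def squared_l2_error(g_prime, g, spacing, cells=20000):
    """Midpoint-rule value of the squared L2 distance from g' to its fit."""
    knots, heights = interval_means(g, spacing)
    total, h, piece = 0.0, 1.0 / cells, 0
    for m in range(cells):
        x = (m + 0.5) * h
        while x > knots[piece + 1]:      # move to the interval containing x
            piece += 1
        total += (g_prime(x) - heights[piece]) ** 2 * h
    return total

def d0(u):                               # logistic density-quantile function
    return u * (1.0 - u)

def d0_prime(u):
    return 1.0 - 2.0 * u

U = [0.25, 0.5, 0.75]
err = squared_l2_error(d0_prime, d0, U)
norm_sq = 1.0 / 3.0                      # integral of (1 - 2u)^2 over [0, 1]
are_location = 1.0 - err / norm_sq       # ARE via the approximation error
```

For this equally spaced three-point design the approximation error gives `are_location` close to $1 - 1/16$, so a smaller approximation error for $d_0'$ translates directly into a larger ARE.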

The preceding discussions are now summarized by way of the following
theorem.

Theorem 1. If $d_0$ ($d_0 \cdot Q_0$) is in $H(R)$ then the following
three problems are equivalent:

(i) Optimal spacing selection for the ABLUE of $\mu$ ($\sigma$) when
$\sigma$ ($\mu$) is known.

(ii) Minimum variance design selection for $\beta_1$ ($\beta_2$) when
$\beta_2$ ($\beta_1$) is known in model (2.3).

(iii) Optimal knot selection for the best $L^2[0,1]$ approximation of
$d_0'$ ($[d_0 \cdot Q_0]'$) by splines of order 1.


It is of interest to note the importance of Theorem 1 with regard
to problems (ii) and (iii). From a regression design perspective the
values of optimal spacings provided in references
[1,3,4,5,10-14,17,18,25,28,29] may now be viewed as optimal designs for a
regression problem with regression function $d_0$ ($d_0 \cdot Q_0$) and
Brownian bridge covariance structure (these are also optimal designs for
models having the Brownian motion covariance kernel, $\min(s,t)$, if a
design point at 1 is appended) whereas from an approximation theory point
of view they may be considered as optimal knot locations for piecewise
constant approximation of $d_0'$ ($[d_0 \cdot Q_0]'$). The optimal spacing
literature, therefore, provides a readily available source for the optimal
designs (in the context of model (2.3)) and optimal knots which correspond
to a rich set of functions. For this reason, it should be of considerable
value, for comparison or other purposes, when alternative design or knot
selection schemes are being considered.

3. Asymptotic results. In this section the asymptotic properties of
certain spacing selection schemes will be analyzed. It will be seen
that, in certain cases, it is possible to characterize the asymptotic
behaviour of spacing sequences with regard to various criteria for
measuring the size of $A(U)$. In addition, spacing sequences that are
asymptotically optimal (in a sense to be defined) for the optimality
criteria $|A(U)|$, $V(\hat\mu(U)) + V(\hat\sigma(U))$ and
$\mathrm{ARE}(\hat\mu(U)) + \mathrm{ARE}(\hat\sigma(U))$ will be
provided. The elements of such sequences can be utilized to provide
an approximate, easily computed, solution to the problems of optimal
and suboptimal spacing selection.

For a nonnegative matrix $B$, let $\psi(B)$ denote either $|B|$ or
$\mathrm{tr}\,BM$ where $M$ is a specified nonnegative matrix. Then the
performance of a spacing sequence, $\{U_n\}_{n=1}^{\infty}$,
$U_n \in D_n$, can be determined from a regret point of view by examining
the asymptotic behaviour of $\psi(A) - \psi(A(U_n))$ or
$\psi(A(U_n)^{-1}) - \psi(A^{-1})$. A sequence satisfying

$$\lim_{n \to \infty} \Big[\inf_{U \in D_n} \psi(A(U)^{-1}) - \psi(A^{-1})\Big] \Big[\psi(A(U_n)^{-1}) - \psi(A^{-1})\Big]^{-1} = 1 \qquad (3.1)$$

is termed asymptotically $\psi_1$-optimum whereas one satisfying

$$\lim_{n \to \infty} \Big[\psi(A) - \sup_{U \in D_n} \psi(A(U))\Big] \Big[\psi(A) - \psi(A(U_n))\Big]^{-1} = 1 \qquad (3.2)$$

is said to be asymptotically $\psi_2$-optimum (this terminology is due to
Sacks and Ylvisaker (1968)). When only one parameter is unknown both (3.1)
and (3.2) may be stated in terms of ARE's. If, for instance, only $\mu$ is
unknown both (3.1) and (3.2) are equivalent to

$$\lim_{n \to \infty} \Big[1 - \sup_{U \in D_n} \mathrm{ARE}(\hat\mu(U))\Big] \Big[1 - \mathrm{ARE}(\hat\mu(U_n))\Big]^{-1} = 1. \qquad (3.3)$$

As in Eubank (1981), density functions will be utilized to generate
spacing sequences. Let $k$ be a continuous density on $[0,1]$ with
associated quantile function $\kappa$. Then $k$, or equivalently $\kappa$,
defines a spacing sequence whose $n$-th element is
$U_n = \{\kappa(\frac{1}{n+1}), \ldots, \kappa(\frac{n}{n+1})\}$. This
sequence is called the regular sequence (RS) generated by $k$ or $\kappa$
and this relationship is indicated by $\{U_n\}$ is RS($\kappa$). Since
$\kappa$, or at least some of its values, must be known in order to obtain
the $U_n$, most of the results which follow will be stated in terms of
$\kappa$ rather than $k$ (to translate conditions on $\kappa$ to conditions
on $k$ it is only necessary to employ the change of variable
$x = \kappa(u)$ and the relation $\kappa'(u) = 1/k(\kappa(u))$). In the
context of knot (design) selection, such $k$'s are frequently termed knot
density functions and $\kappa$ is referred to as a knot quantile function.
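A regular sequence is straightforward to generate once the knot quantile function is available. The sketch below builds the $n$-th element of an RS and, as an assumption of this illustration, obtains the knot quantile function from a strictly positive knot density by numerically inverting its distribution function with a midpoint rule and linear interpolation. The function names are hypothetical.

```python
def regular_sequence(knot_quantile, n):
    """n-th element of RS(kappa): U_n = {kappa(i / (n + 1)), i = 1..n}."""
    return [knot_quantile(i / (n + 1)) for i in range(1, n + 1)]

def quantile_from_density(knot_density, cells=10000):
    """Numerically invert the c.d.f. of a positive knot density k on [0, 1]."""
    h = 1.0 / cells
    cum = [0.0]
    for m in range(cells):
        cum.append(cum[-1] + knot_density((m + 0.5) * h) * h)
    total = cum[-1]                      # renormalise the quadrature error

    def kappa(p):
        target = p * total
        lo, hi = 0, cells
        while hi - lo > 1:               # bisection on the cumulative sums
            mid = (lo + hi) // 2
            if cum[mid] <= target:
                lo = mid
            else:
                hi = mid
        frac = (target - cum[lo]) / (cum[hi] - cum[lo])
        return (lo + frac) * h

    return kappa
```

With the uniform knot density the RS is the equally spaced design $i/(n+1)$, while a density such as $k(x) = 2x$ gives $\kappa(p) = \sqrt{p}$ and so pushes the design points toward 1.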

Let $W^{2,2}(a,b)$ denote the Sobolev space of functions on $(a,b)$
possessing a second distribution derivative in $L^2[a,b]$ and define the
space $W^{2,2}_{loc}(0,1)$ as the set of all functions which are in
$W^{2,2}(a,b)$ whenever $0 < a < b < 1$.

The next theorem details the asymptotic behaviour of spacing sequences for
the trace and determinant criteria. The primary emphasis is upon spacings
generated by a knot or spacing quantile function. In particular, knot
density functions which generate asymptotically optimal spacing sequences
are provided.

Theorem 2. Let $\kappa \in C[0,1]$ be a knot quantile function with a
bounded piecewise continuous derivative $\kappa'$ having the property that
the set of all points where $\kappa'$ is zero or discontinuous has content
zero and has neither 0 nor 1 as accumulation points. Using $g$ to denote
either $d_0$ or $d_0 \cdot Q_0$, it is assumed that
$g \in W^{2,2}_{loc}(0,1) \cap H(R)$ and that for each function there is a
corresponding $\delta > 0$ and a monotone function $h$ on
$I = (0,\delta] \cup [1-\delta,1)$ which satisfies

$$h(x) \ge |g''(x)| \quad \text{for all } x \in I, \qquad (3.4)$$

and

$$\int_I |h(\kappa(x))|^2 \kappa'(x)^3\,dx < \infty. \qquad (3.5)$$

Under these hypotheses the following results hold.

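(i) Let $\psi(B) = \mathrm{tr}\,BM$ and define

$$\phi(u) := (d_0''(u), (d_0 \cdot Q_0)''(u))^t. \qquad (3.6)$$

Then, if $\{U_n\}$ is RS($\kappa$)

$$\lim_{n \to \infty} n^2 \{\mathrm{tr}\,AM - \mathrm{tr}\,A(U_n)M\} = \frac{1}{12} \int_0^1 [\phi(\kappa(x))^t M \phi(\kappa(x))] \kappa'(x)^3\,dx. \qquad (3.7)$$

If $d_0$ and $d_0 \cdot Q_0$ are in $C^2[0,1] \cap H(R)$ then the RS
generated by the density

$$k^*(x) = [\phi(x)^t M \phi(x)]^{1/3} \Big/ \int_0^1 [\phi(s)^t M \phi(s)]^{1/3}\,ds \qquad (3.8)$$

is asymptotically $\psi_2$-optimum. In addition, if $M$ is positive, the RS
generated by the density

$$k^{**}(x) = [\phi(x)^t A^{-1} M A^{-1} \phi(x)]^{1/3} \Big/ \int_0^1 [\phi(s)^t A^{-1} M A^{-1} \phi(s)]^{1/3}\,ds \qquad (3.9)$$

is asymptotically $\psi_1$-optimum.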

(ii) Let $\psi(B) = |B|$. Then, if $\{U_n\}$ is RS($\kappa$)

$$\lim_{n \to \infty} n^2 |A| \{1 - \mathrm{ARE}(\hat\mu(U_n), \hat\sigma(U_n))\} = \frac{|A|}{12} \int_0^1 [\phi(\kappa(x))^t A^{-1} \phi(\kappa(x))] \kappa'(x)^3\,dx. \qquad (3.10)$$

If both $d_0$ and $d_0 \cdot Q_0$ are in $C^2[0,1] \cap H(R)$ then a RS
which is asymptotically $\psi_1$ and $\psi_2$ optimum is generated by the
density

$$k^*(x) = [\phi(x)^t A^{-1} \phi(x)]^{1/3} \Big/ \int_0^1 [\phi(s)^t A^{-1} \phi(s)]^{1/3}\,ds. \qquad (3.11)$$


Proof. The asymptotic optimality of the sequences generated by (3.8),
(3.9), and (3.11) is an immediate consequence of results given in Sacks
and Ylvisaker (1968). To obtain (3.7) and (3.10) first note that under
the present assumptions the work of Pence and Smith (1981) (or Barrow and
Smith (1978) under stronger conditions on $\kappa$) in conjunction with
Theorem 1 has the consequence that

$$\lim_{n \to \infty} n^2 \|g - P_{U_n} g\|_R^2 = \frac{1}{12} \int_0^1 [g''(\kappa(x))]^2 \kappa'(x)^3\,dx, \qquad (3.12)$$

where $P_{U_n}$ is the $H(R)$ orthogonal projection operator for the
subspace spanned by $\{R(\cdot, u_i) \mid u_i \in U_n\}$ and $g$ has been
utilized to denote either $d_0$ or $d_0 \cdot Q_0$. Equation (3.12),
along with the technique utilized in proving Theorems 4.1 and 4.2 in Sacks
and Ylvisaker (1968), then gives the desired results.
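The limit in (3.12) can be checked numerically in a simple case. The sketch below takes the logistic law, $d_0(u) = u(1-u)$ with $d_0'' = -2$, and the RS generated by $\kappa(x) = x$; it assumes the limit constant is $\frac{1}{12}$ times the integral of $[g''(\kappa(x))]^2 \kappa'(x)^3$, which here equals $1/3$. The projection error is computed from the results of Section 2 as $\|g'\|_{L^2}^2$ minus the sum of squared increments of $g$ divided by the interval lengths. Function names are illustrative.

```python
def squared_projection_error(g, norm_sq, spacing):
    """||g - P_U g||_R^2 = ||g'||^2 - sum (delta g)^2 / (delta u)."""
    knots = [0.0] + sorted(spacing) + [1.0]
    captured = sum((g(b) - g(a)) ** 2 / (b - a)
                   for a, b in zip(knots, knots[1:]))
    return norm_sq - captured

def d0(u):                               # logistic density-quantile function
    return u * (1.0 - u)

def scaled_error(n):
    """n^2 ||d0 - P_{U_n} d0||_R^2 for the RS with kappa(x) = x."""
    spacing = [i / (n + 1) for i in range(1, n + 1)]
    return n * n * squared_projection_error(d0, 1.0 / 3.0, spacing)

# (1/12) * integral of [d0''(x)]^2 over [0, 1], with d0'' = -2
limit = (1.0 / 12.0) * 4.0
```

Evaluating `scaled_error(n)` for increasing `n` shows the scaled error rising toward `limit`; in this case the exact value is $n^2 / (3(n+1)^2)$.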

Theorem 2 provides the necessary tools for analyzing the asymptotic
behaviour of the various spacing selection schemes. Using (3.11) in (3.10)
and the asymptotic optimality of the corresponding RS it follows that for
spacings obtained by maximizing Ogawa's ARE expression

$$\lim_{n \to \infty} n^2 |A| \Big\{1 - \sup_{U \in D_n} \mathrm{ARE}(\hat\mu(U), \hat\sigma(U))\Big\} = \frac{|A|}{12} \Big\{\int_0^1 [\phi(x)^t A^{-1} \phi(x)]^{1/3}\,dx\Big\}^3. \qquad (3.13)$$

This is to be compared to the approach of maximizing the sum of the ARE's
utilized by Hassanein (1977). In this latter case, using (3.7) and (3.8)
with $M^{-1} = \mathrm{diag}(\|d_0\|_R^2, \|d_0 \cdot Q_0\|_R^2)$, so that
$\mathrm{tr}\,AM = 2$, one obtains

$$\lim_{n \to \infty} n^2 \Big\{2 - \sup_{U \in D_n} [\mathrm{ARE}(\hat\mu(U)) + \mathrm{ARE}(\hat\sigma(U))]\Big\} = \frac{1}{12} \Big\{\int_0^1 [\phi(x)^t M \phi(x)]^{1/3}\,dx\Big\}^3. \qquad (3.14)$$

Finally, if the criterion is minimization of the sum of variances, as in
Hassanein (1969a, 1969b) and Eisenberger and Posner (1965), then from
(3.9) and Theorem 4.5 of Sacks and Ylvisaker (1968) with $M = I$, the
limiting behaviour is given by

$$\lim_{n \to \infty} n^2 \Big\{\inf_{U \in D_n} \mathrm{tr}\,A(U)^{-1} - \mathrm{tr}\,A^{-1}\Big\} = \frac{1}{12} \Big\{\int_0^1 [\phi(x)^t A^{-2} \phi(x)]^{1/3}\,dx\Big\}^3. \qquad (3.15)$$

Thus, for symmetric distributions, where $A$ is a diagonal matrix, spacings
which are D-optimal or maximize
$\mathrm{ARE}(\hat\mu(U)) + \mathrm{ARE}(\hat\sigma(U))$ will have the same
asymptotic behaviour. However, the asymptotic properties of these
spacings will, in general, differ from those of spacings obtained by
minimizing the sum of the variances, even for symmetric distributions.

All three spacing selection schemes will behave similarly for
distributions such as the Cauchy where $A$ is a constant multiple of the
identity. In fact, for the Cauchy distribution asymptotically optimal
spacing sequences for the criteria $|A(U)|$,
$V(\hat\mu(U)) + V(\hat\sigma(U))$, and
$\mathrm{ARE}(\hat\mu(U)) + \mathrm{ARE}(\hat\sigma(U))$ are all generated
by the same knot quantile function, $\kappa^*(x) = x$.

The density (3.11) was also derived for use in spacing selection by
Sarndal (1962) using variational methods and under more stringent
conditions. For an alternative approach see Eubank (1981).
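The Cauchy remark can be verified numerically. For the standard Cauchy law $d_0(u) = \sin^2(\pi u)/\pi$ and $d_0(u) Q_0(u) = -\sin(2\pi u)/(2\pi)$, and the intrinsic accuracy matrix is one half of the identity. The sketch below, which uses central finite differences as an illustrative shortcut and hypothetical function names, evaluates the bracketed term of (3.11) at several points and shows that it does not depend on $u$, so the resulting knot density is uniform.

```python
import math

def d0(u):                     # Cauchy density-quantile: sin(pi u)^2 / pi
    return math.sin(math.pi * u) ** 2 / math.pi

def d0_q0(u):                  # d_0(u) Q_0(u) = -sin(2 pi u) / (2 pi)
    return -math.sin(2.0 * math.pi * u) / (2.0 * math.pi)

def second_derivative(g, u, h=1e-4):
    """Central finite-difference approximation to g''(u)."""
    return (g(u + h) - 2.0 * g(u) + g(u - h)) / (h * h)

def phi(u):
    """phi(u) = (d_0''(u), (d_0 Q_0)''(u))^t as in (3.6)."""
    return (second_derivative(d0, u), second_derivative(d0_q0, u))

A_DIAG = (0.5, 0.5)            # A = (1/2) I for the Cauchy law

def density_weight(u):
    """phi(u)^t A^{-1} phi(u), the bracketed term in (3.11)."""
    p1, p2 = phi(u)
    return p1 * p1 / A_DIAG[0] + p2 * p2 / A_DIAG[1]

weights = [density_weight(u) for u in (0.1, 0.3, 0.5, 0.7, 0.9)]
```

Every entry of `weights` is close to $8\pi^2$, because $d_0'' = 2\pi\cos(2\pi u)$ and $(d_0 Q_0)'' = 2\pi\sin(2\pi u)$ have a constant sum of squares.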

4. Robust estimation. Chan and Rhodin (1980) have proposed a
technique which utilizes the ABLUE to accomplish the robust estimation
of the location parameter of a symmetric distribution. They assume that
the true underlying distribution is a member of Tukey's lambda family
or is a normal, double exponential or Cauchy distribution (or, at least,
is well modelled by one or more of these laws). In addition, they assume
that $F_0$ belongs to a known finite subset, $L$, of these families of
distributions. Let $\mathrm{ARE}(\hat\mu(U) \mid G)$ denote the ARE
corresponding to the spacing $U$ when estimation is to be accomplished
under the assumption that the data has the distribution $G \in L$. Then,
to estimate $\mu$, Chan and Rhodin take as their guess for $F_0$ any
distribution $F^* \in L$ which satisfies

$$\min_{G \in L} \mathrm{ARE}(\hat\mu(U(F^*)) \mid G) = \max_{F \in L} \min_{G \in L} \mathrm{ARE}(\hat\mu(U(F)) \mid G), \qquad (4.1)$$

where $U(F)$ is an optimal spacing for the distribution $F$. This approach
requires that the function $\mathrm{ARE}(\hat\mu(U(F)) \mid G)$ be
tabulated for all pairwise combinations of laws in $L$ and for each value
of $n$ that is to be considered (Chan and Rhodin provide tables for
$n = 2(1)5$).

From the results in Section 2 it follows that the procedure utilized
by Chan and Rhodin (1980) is equivalent to: select an $F^* \in L$ such that

$$\max_{G \in L} \|d_G - P_{U(F^*)} d_G\|_R = \min_{F \in L} \max_{G \in L} \|d_G - P_{U(F)} d_G\|_R, \qquad (4.2)$$

where $d_G$ is the density-quantile function for $G \in L$ and $P_U$
denotes the $H(R)$ orthogonal projection operator for the subspace spanned
by $\{R(\cdot, u_i) \mid u_i \in U\}$. This suggests the following
asymptotic approach to robust spacing selection. Under conditions such as
those in Theorem 2, define for each $F \in L$ the knot density function

$$k_F(x) = \{d_F''(x)\}^{2/3} \Big/ \int_0^1 \{d_F''(s)\}^{2/3}\,ds \qquad (4.3)$$

with corresponding knot quantile function denoted $\kappa_F$. If
$d_F \in C^2[0,1]$ the density $k_F$ generates a sequence of asymptotically
optimal (in the sense of (3.3)) spacings, $\{U_n(F)\}$, for location
parameter estimation when $F$ is the true parent distribution of the data
(c.f. Eubank (1981)). Then, from (3.12)

$$\|d_G - P_{U_n(F)} d_G\|_R^2 = \frac{1}{12 n^2} \int_0^1 [d_G''(\kappa_F(x))]^2 \kappa_F'(x)^3\,dx + o(n^{-2}). \qquad (4.4)$$

Hence, an asymptotic version of (4.1) is: select $F^* \in L$ so that

$$\max_{G \in L} \int_0^1 [d_G''(\kappa_{F^*}(x))]^2 \kappa_{F^*}'(x)^3\,dx = \min_{F \in L} \max_{G \in L} \int_0^1 [d_G''(\kappa_F(x))]^2 \kappa_F'(x)^3\,dx. \qquad (4.5)$$

To determine $F^*$ for a given $L$ one must evaluate (usually by numerical
techniques) the function
$\int_0^1 [d_G''(\kappa_F(x))]^2 \kappa_F'(x)^3\,dx$ for all pairwise
combinations of laws in $L$. However, in contrast to the procedure
suggested by Chan and Rhodin, the resulting tabulation suffices for all
values of $n$. An estimator of $\mu$ based on $n$ quantiles is then
provided through use of the spacing $U_n(F^*)$ and the corresponding
coefficients for $F^*$ that may be obtained from Ogawa (1951).
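The pairwise tabulation and the min-max selection can be sketched numerically. The example below is only an illustration under stated assumptions: the set of laws (logistic and Cauchy) is chosen for its closed-form second derivatives and is not the family studied by Chan and Rhodin; the integral in (4.5) is evaluated after the change of variable $x = \kappa_F(u)$, which turns it into $\int_0^1 [d_G''(x)]^2 / k_F(x)^2\,dx$; and a plain midpoint rule is used. The Cauchy knot density vanishes at $1/4$ and $3/4$, so $\kappa_F'$ is unbounded and the logistic-versus-Cauchy integral actually diverges; the midpoint rule then returns a large, grid-dependent value, which is enough to rule that law out.

```python
import math

def integrate(fn, cells=2000):
    """Midpoint rule on [0, 1]; midpoints avoid 0, 1/4, 3/4 and 1."""
    h = 1.0 / cells
    return sum(fn((m + 0.5) * h) for m in range(cells)) * h

def knot_density(d2_f):
    """k_F of (4.3): |d_F''|^(2/3), normalised to integrate to one."""
    const = integrate(lambda s: abs(d2_f(s)) ** (2.0 / 3.0))
    return lambda x: abs(d2_f(x)) ** (2.0 / 3.0) / const

def criterion(d2_g, d2_f):
    """Integral in (4.5), after the change of variable x = kappa_F(u)."""
    k_f = knot_density(d2_f)
    return integrate(lambda x: d2_g(x) ** 2 / k_f(x) ** 2)

def select_law(second_derivs):
    """Return (name of F*, worst-case criterion value for each F)."""
    worst = {
        f_name: max(criterion(d2_g, d2_f) for d2_g in second_derivs.values())
        for f_name, d2_f in second_derivs.items()
    }
    return min(worst, key=worst.get), worst

# Second derivatives of the density-quantile functions (illustrative set L).
LAWS = {
    "logistic": lambda x: -2.0,          # d(u) = u (1 - u)
    "cauchy": lambda x: 2.0 * math.pi * math.cos(2.0 * math.pi * x),
}
best, worst_case = select_law(LAWS)
```

Here `best` comes out as the logistic law: its uniform knot density has a worst-case criterion value of $2\pi^2$ (attained when the data are Cauchy), whereas the Cauchy knot density performs arbitrarily badly for logistic data. As the text notes, the table behind `worst_case` does not depend on $n$.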

The solution (4.5) is applicable to any (finite) set of laws $L$,
whether symmetric or not, provided conditions such as those in Theorem 2
are satisfied by the elements of $L$. A scale parameter version of (4.5)
can be obtained by using $d \cdot Q$ rather than $d$ in (4.3) and (4.5).
For the simultaneous estimation of $\mu$ and $\sigma$ either (3.6) and
(3.8) or (3.10) and (3.11) can be utilized to construct an analogous
criterion for robust spacing selection.